824 research outputs found

    The tangent FFT

    The split-radix FFT computes a size-n complex DFT, when n is a large power of 2, using just 4n lg n − 6n + 8 arithmetic operations on real numbers. This operation count was first announced in 1968, stood unchallenged for more than thirty years, and was widely believed to be best possible. Recently James Van Buskirk posted software demonstrating that the split-radix FFT is not optimal. Van Buskirk's software computes a size-n complex DFT using only (34/9 + o(1))n lg n arithmetic operations on real numbers. There are now three papers attempting to explain the improvement from 4 to 34/9: Johnson and Frigo, IEEE Transactions on Signal Processing, 2007; Lundy and Van Buskirk, Computing, 2007; and this paper. This paper presents the "tangent FFT," a straightforward in-place cache-friendly DFT algorithm having exactly the same operation counts as Van Buskirk's algorithm. This paper expresses the tangent FFT as a sequence of standard polynomial operations, and pinpoints how the tangent FFT saves time compared to the split-radix FFT. This description is helpful not only for understanding and analyzing Van Buskirk's improvement but also for minimizing the memory-access costs of the FFT.
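To make the "4 to 34/9" improvement concrete, the following minimal sketch evaluates the classical split-radix count and the relative saving in the leading coefficient. It assumes the well-known split-radix formula 4n lg n − 6n + 8 (valid for n = 2^k with k ≥ 1) and uses only the leading term (34/9) n lg n of the improved count; the function names are hypothetical and the lower-order terms of the improved count are deliberately not modelled.

```python
from fractions import Fraction


def split_radix_ops(n: int) -> int:
    """Real-operation count 4 n lg n - 6 n + 8 of the split-radix FFT.

    Assumes n is a power of 2 with n >= 2 (the formula does not hold for n = 1).
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of 2 and at least 2")
    lg_n = n.bit_length() - 1  # exact base-2 logarithm of a power of 2
    return 4 * n * lg_n - 6 * n + 8


def leading_term_saving() -> Fraction:
    """Relative saving when the n lg n coefficient drops from 4 to 34/9."""
    return 1 - Fraction(34, 9) / 4


if __name__ == "__main__":
    for n in (2, 4, 1024):
        print(n, split_radix_ops(n))
    saving = leading_term_saving()
    print(f"asymptotic saving: {saving} (about {float(saving):.2%})")
```

The saving is 1/18, so the improvement is asymptotically a little under 6 percent of the arithmetic; for small n the lower-order terms matter and this sketch says nothing about them.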

    Faster binary-field multiplication and faster binary-field MACs

    This paper shows how to securely authenticate messages using just 29 bit operations per authenticated bit, plus a constant overhead per message. The authenticator is a standard type of "universal" hash function providing information-theoretic security; what is new is computing this type of hash function at very high speed. At a lower level, this paper shows how to multiply two elements of a field of size 2^128 using just 9062 ≈ 71 · 128 bit operations, and how to multiply two elements of a field of size 2^256 using just 22164 ≈ 87 · 256 bit operations. This performance relies on a new representation of field elements and new FFT-based multiplication techniques. This paper's constant-time software uses just 1.89 Core 2 cycles per byte to authenticate very long messages. On a Sandy Bridge it takes 1.43 cycles per byte, without using Intel's PCLMULQDQ polynomial-multiplication hardware. This is much faster than the speed records for constant-time implementations of GHASH without PCLMULQDQ (over 10 cycles/byte), even faster than Intel's best Sandy Bridge implementation of GHASH with PCLMULQDQ (1.79 cycles/byte), and almost as fast as state-of-the-art 128-bit prime-field MACs using Intel's integer-multiplication hardware (around 1 cycle/byte). Keywords: Performance, FFTs, Polynomial multiplication, Universal hashing, Message authentication
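For readers unfamiliar with binary-field multiplication, here is a minimal baseline sketch: the textbook shift-and-XOR method on Python integers. It is not the paper's technique (which relies on a new representation and FFT-based multiplication) and it is not constant-time; the function name `gf2_mul` and the encoding of polynomials as integers (bit i is the coefficient of x^i) are assumptions made for illustration.

```python
def gf2_mul(a: int, b: int, modulus: int) -> int:
    """Multiply two binary-field elements, reducing modulo `modulus`.

    Elements are polynomials over GF(2) packed into integers (bit i is the
    coefficient of x^i). `a` must already be reduced, i.e. have degree
    below that of `modulus`. Schoolbook method, for illustration only.
    """
    degree = modulus.bit_length() - 1
    product = 0
    while b:
        if b & 1:
            product ^= a          # addition in GF(2)[x] is XOR
        b >>= 1
        a <<= 1                   # multiply a by x
        if (a >> degree) & 1:
            a ^= modulus          # reduce as soon as the degree overflows
    return product


# Degree-8 example with the AES polynomial x^8 + x^4 + x^3 + x + 1.
AES_MODULUS = 0x11B
# Degree-128 example with x^128 + x^7 + x^2 + x + 1 (the GCM polynomial,
# here in plain bit order rather than GCM's reflected convention).
MODULUS_128 = (1 << 128) | 0x87

if __name__ == "__main__":
    print(hex(gf2_mul(0x57, 0x83, AES_MODULUS)))
    print(hex(gf2_mul(0x1234567890ABCDEF, 0xFEDCBA0987654321, MODULUS_128)))
```

The loop performs on the order of 128 conditional XORs of 128-bit values for a field of size 2^128, which is the kind of cost that the bit-operation counts in the abstract should be compared against.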

    The new SHA-3 software shootout

    This paper introduces a new graphing mechanism to allow easy comparison of software performance of the SHA-3 candidates. The new mechanism concisely captures a large amount of performance data without oversimplifying the data. We have integrated this graphing mechanism into our eBASH (ECRYPT Benchmarking of All Submitted Hashes) project. New graphs are automatically posted at the top of http://bench.cr.yp.to/results-sha3.html whenever the eBASH performance results are updated. This paper includes snapshots of these graphs, but readers are advised to check the web page for the latest updates. See http://bench.cr.yp.to for more information regarding eBASH. For each function there is also a similar graph online comparing implementations of that function, showing in a concise way which implementations are slow or non-functional. Implementors can follow links from http://bench.cr.yp.to/primitives-sha3.html to find these graphs. Of course, users concerned about performance will reject slower implementations in favor of faster implementations, so the main shootout graphs reflect only the fastest implementations.

    Efficient arithmetic on elliptic curves in characteristic 2

    We present normal forms for elliptic curves over a field of characteristic 2 analogous to Edwards normal form, and determine bases of addition laws, which provide strikingly simple expressions for the group law. We deduce efficient algorithms for point addition and scalar multiplication on these forms. The resulting algorithms apply to any elliptic curve over a field of characteristic 2 with a 4-torsion point, via an isomorphism with one of the normal forms. We deduce algorithms for duplication in time $2M + 5S + 2m_c$ and for addition of points in time $7M + 2S$, where $M$ is the cost of multiplication, $S$ the cost of squaring, and $m_c$ the cost of multiplication by a constant. By a study of the Kummer curves $\mathcal{K} = E/\{\pm 1\}$, we develop an algorithm for scalar multiplication with point recovery which computes the multiple of a point $P$ with $4M + 4S + 2m_c + m_t$ per bit, where $m_t$ is multiplication by a constant that depends on $P$.

    Using LDGM Codes and Sparse Syndromes to Achieve Digital Signatures

    In this paper, we address the problem of achieving efficient code-based digital signatures with small public keys. The solution we propose exploits sparse syndromes and randomly designed low-density generator matrix codes. Based on our evaluations, the proposed scheme outperforms existing solutions, achieving considerable security levels with very small public keys. Comment: 16 pages. The final publication is available at springerlink.com

    Elligator: elliptic-curve points indistinguishable from uniform random strings

    Censorship-circumvention tools are in an arms race against censors. The censors study all traffic passing into and out of their controlled sphere, and try to disable censorship-circumvention tools without completely shutting down the Internet. Tools aim to shape their traffic patterns to match unblocked programs, so that simple traffic profiling cannot identify the tools within a reasonable number of traces; the censors respond by deploying firewalls with increasingly sophisticated deep-packet inspection. Cryptography hides patterns in user data but does not evade censorship if the censor can recognize patterns in the cryptography itself. In particular, elliptic-curve cryptography often transmits points on known elliptic curves, and those points are easily distinguishable from uniform random strings of bits. This paper introduces high-security high-speed elliptic-curve systems in which elliptic-curve points are encoded so as to be indistinguishable from uniform random strings. At a lower level, this paper introduces a new bijection between strings and about half of all curve points; this bijection is applicable to every odd-characteristic elliptic curve with a point of order 2, except for curves of j-invariant 1728. This paper also presents guidelines to construct, and two examples of, secure curves suitable for these encodings.

    Kummer strikes back: new DH speed records

    This paper introduces high-security constant-time variable-base-point Diffie-Hellman software using just 274593 Cortex-A8 cycles, 91460 Sandy Bridge cycles, 90896 Ivy Bridge cycles, or 72220 Haswell cycles. The only higher speed appearing in the literature for any of these platforms is a claim of 60000 Haswell cycles for unpublished software performing arithmetic on a binary elliptic curve. The new speeds rely on a synergy between (1) state-of-the-art formulas for genus-2 hyperelliptic curves and (2) a modern trend towards vectorization in CPUs. The paper introduces several new techniques for efficient vectorization of Kummer-surface computations. Keywords: implementation / performance, Diffie-Hellman, hyperelliptic curves, Kummer surfaces, vectorization

    Optimizing double-base elliptic-curve single-scalar multiplication

    This paper analyzes the best speeds that can be obtained for single-scalar multiplication with variable base point by combining a huge range of options: many choices of coordinate systems and formulas for individual group operations, including new formulas for tripling on Edwards curves; double-base chains with many different doubling/tripling ratios, including standard base-2 chains as an extreme case; and many precomputation strategies, going beyond Dimitrov, Imbert, Mishra (Asiacrypt 2005) and Doche and Imbert (Indocrypt 2006). The analysis takes account of speedups such as S-M tradeoffs and includes recent advances such as inverted Edwards coordinates. The main conclusions are as follows. Optimized precomputations and triplings save time for single-scalar multiplication in Jacobian coordinates, Hessian curves, and tripling-oriented Doche/Icart/Kohel curves. However, even faster single-scalar multiplication is possible in Jacobi intersections, Edwards curves, extended Jacobi-quartic coordinates, and inverted Edwards coordinates, thanks to extremely fast doublings and additions; there is no evidence that double-base chains are worthwhile for the fastest curves. Inverted Edwards coordinates are the speed leader.

    Elliptic Curve Scalar Multiplication Combining Yao’s Algorithm and Double Bases

    In this paper we propose to take one step back in the use of double base number systems for elliptic curve point scalar multiplication. Using a modified version of Yao's algorithm, we go back from the popular double base chain representation to a more general double base system. Instead of representing an integer $k$ as $\sum_{i=1}^{n} 2^{b_i} 3^{t_i}$ where $(b_i)$ and $(t_i)$ are two decreasing sequences, we only set a maximum value for both of them. Then, we analyze the efficiency of our new method using different bases and optimal parameters. In particular, we propose for the first time a binary/Zeckendorf representation for integers, providing interesting results. Finally, we provide a comprehensive comparison to state-of-the-art methods, including a large variety of curve shapes and the latest speed-ups in point-addition formulae.
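As an illustration of what a general double-base representation looks like, the sketch below writes a positive integer k as a sum of terms 2^b · 3^t using the simple greedy rule "subtract the largest such term that fits". This is an assumption chosen for brevity, not the Yao-based method of the abstract: the greedy rule imposes no monotonicity on the exponent sequences and no explicit bound on them, and the function names are hypothetical.

```python
def largest_2_3_integer(k: int) -> tuple[int, int, int]:
    """Return (value, b, t) with value = 2^b * 3^t the largest such number <= k."""
    best = (1, 0, 0)
    power_of_3, t = 1, 0
    while power_of_3 <= k:
        # Largest b with power_of_3 * 2^b <= k.
        b = (k // power_of_3).bit_length() - 1
        value = power_of_3 << b
        if value > best[0]:
            best = (value, b, t)
        power_of_3 *= 3
        t += 1
    return best


def greedy_double_base(k: int) -> list[tuple[int, int]]:
    """Greedy expansion of k > 0 as a sum of terms 2^b * 3^t.

    Returns the exponent pairs (b, t), largest term first.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    terms = []
    while k > 0:
        value, b, t = largest_2_3_integer(k)
        terms.append((b, t))
        k -= value
    return terms


if __name__ == "__main__":
    k = 127
    terms = greedy_double_base(k)
    print(terms)  # [(2, 3), (1, 2), (0, 0)], i.e. 108 + 18 + 1
    assert sum(2**b * 3**t for b, t in terms) == k
```

For k = 127 the expansion has three terms, whereas the binary expansion of 127 has seven nonzero bits; sparsity of this kind is what makes double-base systems attractive for scalar multiplication.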

    What fraction of stars formed in infrared galaxies at high redshift?

    Star formation happens in two types of environment: ultraviolet-bright starbursts (like 30 Doradus and HII galaxies at low redshift and Lyman-break galaxies at high redshift) and infrared-bright dust-enshrouded regions (which may be moderately star-forming like Orion in the Galaxy or extreme like the core of Arp 220). In this work I will estimate how many of the stars in the local Universe formed in each type of environment, using observations of star-forming galaxies at all redshifts at different wavelengths and of the evolution of the field galaxy population. Comment: 7 pages, 0 figs, to appear in proceedings of "Starbursts - From 30 Doradus to Lyman break galaxies", edited by Richard de Grijs and Rosa M. Gonzalez Delgado, published by Kluwer